A Unified Framework for Provably Efficient Algorithms to Estimate Shapley Values

arXiv.org Artificial Intelligence

Shapley values have emerged as a critical tool for explaining which features impact the decisions made by machine learning models. However, computing exact Shapley values is difficult, generally requiring a number of model evaluations that is exponential in the feature dimension. To address this, many model-agnostic randomized estimators have been developed, the most influential and widely used being the KernelSHAP method (Lundberg & Lee, 2017). While related estimators such as unbiased KernelSHAP (Covert & Lee, 2021) and LeverageSHAP (Musco & Witter, 2025) are known to satisfy theoretical guarantees, bounds for KernelSHAP have remained elusive. We describe a broad and unified framework that encompasses KernelSHAP and related estimators constructed using sampling strategies both with and without replacement. We then prove strong non-asymptotic theoretical guarantees that apply to all estimators from our framework. This provides, to the best of our knowledge, the first theoretical guarantees for KernelSHAP and sheds further light on tradeoffs between existing estimators. Through comprehensive benchmarking on small- and medium-dimensional datasets for decision-tree models, we validate our approach against exact Shapley values, consistently achieving low mean squared error with modest sample sizes. Furthermore, we make specific implementation improvements to enable scalability of our methods to high-dimensional datasets. Our methods, tested on datasets such as MNIST and CIFAR10, provide consistently better results compared to the KernelSHAP library.
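To make the exponential cost mentioned in the abstract concrete, here is a minimal sketch of exact Shapley value computation by subset enumeration, using the standard formula phi_i = sum over S not containing i of |S|!(n-|S|-1)!/n! * (v(S ∪ {i}) - v(S)). It is not KernelSHAP or any estimator from the paper; the function names and the toy value function are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import factorial


def exact_shapley(value_fn, n_features):
    """Exact Shapley values by enumerating every subset of the other features.

    value_fn takes a frozenset of feature indices and returns a number.
    The loop visits 2^(n-1) subsets per feature, so cost grows exponentially.
    """
    players = list(range(n_features))
    phi = [0.0] * n_features
    for i in players:
        others = [j for j in players if j != i]
        for size in range(n_features):
            # Shapley weight for coalitions of this size
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i to coalition s
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


def toy_value(subset):
    """Hypothetical value function: additive terms plus one interaction."""
    weights = [1.0, 2.0, 3.0]
    total = sum(weights[j] for j in subset)
    if 0 in subset and 1 in subset:
        total += 4.0  # interaction between features 0 and 1
    return total


phi = exact_shapley(toy_value, 3)
print(phi)
```

In this toy game the interaction bonus is split evenly between features 0 and 1, and the values sum to the value of the full feature set. With a few dozen features the same loop becomes infeasible, which is the motivation for randomized estimators.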


Hidden Question Representations Tell Non-Factuality Within and Across Large Language Models

arXiv.org Artificial Intelligence

Despite the remarkable advances of large language models (LLMs), the prevalence of non-factual responses remains a common issue. This work studies non-factuality prediction (NFP), which predicts whether an LLM will generate non-factual responses to a question before the generation process. Previous efforts on NFP usually rely on extensive computation. In this work, we conduct extensive analysis to explore the capabilities of using a lightweight probe to elicit "whether an LLM knows" from the hidden representations of questions. Additionally, we discover that the non-factuality probe employs similar patterns for NFP across multiple LLMs. Motivated by this intriguing finding, we conduct effective transfer learning for cross-LLM NFP and propose a question-aligned strategy to ensure the efficacy of mini-batch-based training.


Graph Edits for Counterfactual Explanations: A Unified GNN Approach

arXiv.org Artificial Intelligence

Counterfactuals have been established as a popular explainability technique which leverages a set of minimal edits to alter the prediction of a classifier. When considering conceptual counterfactuals, the edits requested should correspond to salient concepts present in the input data. At the same time, conceptual distances are defined by knowledge graphs, ensuring the optimality of conceptual edits. In this work, we extend previous endeavors on conceptual counterfactuals by introducing graph edits as counterfactual explanations: if we represent input data as graphs, what is the shortest graph edit path that results in an alternative classification label as provided by a black-box classifier?


Kernel Estimates as General Concept for the Measuring of Pedestrian Density

arXiv.org Artificial Intelligence

The standard definition of pedestrian density produces scattered values; hence, many approaches have been developed to improve the features of the estimated density. This paper provides a review of generally applied methods and presents a general framework based on various kernels that bring desired properties of density estimates (e.g., continuity) and incorporate ordinarily used methods. The developed kernel concept considers each pedestrian as a source of density distribution, parametrized by the kernel type (e.g., Gaussian, cone) and kernel size. The quantitative parametric study performed on experimental data illustrates that parametrization brings desired features; for instance, a conic kernel with a base radius in (0.7, 1.2) m produces smooth values that retain trend features. The correspondence between kernel and non-kernel methods (namely the Voronoi diagram and a customized inverse distance to the nearest pedestrian) is achievable for a wide range of kernel parameters. Thereby the generality of the concept is supported.
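The idea of treating each pedestrian as a source of density can be sketched as follows. This is an illustration only: the conic kernel is assumed to be normalized so that it integrates to 1 over the plane (factor 3 / (pi R^2)), which may differ from the paper's exact definition, and the function names and coordinates are hypothetical.

```python
from math import hypot, pi


def conic_kernel(distance, radius):
    """Cone-shaped kernel with base radius `radius` (metres).

    Assumed normalization: the kernel integrates to 1 over the plane,
    so each pedestrian contributes exactly one person to the total.
    """
    if distance >= radius:
        return 0.0
    return 3.0 / (pi * radius ** 2) * (1.0 - distance / radius)


def density_at(point, pedestrians, radius=1.0):
    """Density (pedestrians per square metre) at `point`.

    Sums the kernel contribution of every pedestrian position.
    """
    px, py = point
    return sum(conic_kernel(hypot(px - x, py - y), radius)
               for x, y in pedestrians)


# Hypothetical positions in metres
pedestrians = [(0.0, 0.0), (0.5, 0.0), (3.0, 3.0)]
rho = density_at((0.0, 0.0), pedestrians, radius=1.0)
print(rho)
```

Because the kernel falls off linearly with distance, the resulting density field is continuous, and the base radius controls how strongly individual positions are smoothed.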


Semi-supervised Protein Classification Using Cluster Kernels

Neural Information Processing Systems

A key issue in supervised protein classification is the representation of input sequences of amino acids. Recent work using string kernels for protein data has achieved state-of-the-art classification performance. However, such representations are based only on labeled data -- examples with known 3D structures, organized into structural classes -- while in practice, unlabeled data is far more plentiful. In this work, we develop simple and scalable cluster kernel techniques for incorporating unlabeled data into the representation of protein sequences. We show that our methods greatly improve the classification performance of string kernels and outperform standard approaches for using unlabeled data, such as adding close homologs of the positive examples to the training data. We achieve equal or superior performance to previously presented cluster kernel methods while achieving far greater computational efficiency.
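One way unlabeled sequences can enter a kernel representation is by averaging a base string kernel over each sequence's neighborhood of similar unlabeled sequences. The sketch below illustrates that general idea under stated assumptions: the base kernel is a simple k-mer (spectrum) kernel, the neighborhood is defined by a hypothetical similarity threshold `min_shared`, and none of the names or parameters are taken from the paper.

```python
from collections import Counter


def spectrum_kernel(a, b, k=2):
    """Base string kernel: dot product of k-mer count vectors."""
    ca = Counter(a[i:i + k] for i in range(len(a) - k + 1))
    cb = Counter(b[i:i + k] for i in range(len(b) - k + 1))
    return sum(ca[m] * cb[m] for m in ca)


def neighborhood(seq, unlabeled, k=2, min_shared=2):
    """The sequence itself plus unlabeled sequences similar to it.

    `min_shared` is a hypothetical similarity threshold on the base kernel.
    """
    return [seq] + [s for s in unlabeled
                    if spectrum_kernel(seq, s, k) >= min_shared]


def neighborhood_kernel(x, y, unlabeled, k=2, min_shared=2):
    """Average the base kernel over all pairs from the two neighborhoods."""
    nx = neighborhood(x, unlabeled, k, min_shared)
    ny = neighborhood(y, unlabeled, k, min_shared)
    total = sum(spectrum_kernel(a, b, k) for a in nx for b in ny)
    return total / (len(nx) * len(ny))


# Hypothetical toy sequences
unlabeled_pool = ["ACDEG", "ACDFG", "WWYYW"]
k_base = spectrum_kernel("ACDE", "ACDF")
k_nbhd = neighborhood_kernel("ACDE", "ACDF", unlabeled_pool)
print(k_base, k_nbhd)
```

With an empty unlabeled pool the neighborhood kernel reduces to the base kernel, so the unlabeled data only changes the representation where similar sequences exist.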

